    Online Attacks on Picture Owner Privacy

    We present an online attribute inference attack that leverages Facebook picture metadata: (i) alt-text generated by Facebook to describe picture contents, and (ii) comments containing words and emojis posted by other Facebook users. Specifically, we study the correlation of the picture's owner with the Facebook-generated alt-text and the comments used by commenters when reacting to the image. We concentrate on the gender attribute, which is highly relevant for targeted advertising or privacy breaches. We explore how to launch an online gender inference attack on any Facebook user by handling newly discovered online vocabulary using the retrofitting process to enrich a core vocabulary built during offline training. Our experiments show that even when the user hides most public data (e.g., friend list, attribute, page, group), an attacker can detect user gender with AUC (area under the ROC curve) from 87% to 92%, depending on the picture metadata availability. Moreover, we can detect with high accuracy sequences of words leading to gender disclosure, and accordingly, enable users to derive countermeasures and configure their privacy settings safely.
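
The AUC figure quoted in the abstract above can be read as the probability that a randomly chosen positive example is scored higher than a randomly chosen negative one. The following is a minimal sketch of that pairwise definition, not the authors' evaluation code; the function name `auc` and the toy labels and scores are assumptions made for illustration.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of scores.

    labels: iterable of 0/1 class labels (1 = positive class).
    scores: iterable of classifier scores, higher = more likely positive.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): three of the four positive/negative pairs are ordered correctly.
toy_labels = [1, 1, 0, 0]
toy_scores = [0.9, 0.4, 0.5, 0.1]
toy_auc = auc(toy_labels, toy_scores)
```

The pairwise loop is quadratic in the number of examples; for large evaluation sets a rank-based computation would be the usual choice.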

    Semantic Sentiment Analysis of Twitter Data

    The Internet and the proliferation of smart mobile devices have changed the way information is created, shared, and spread: microblogs such as Twitter, weblogs such as LiveJournal, social networks such as Facebook, and instant messengers such as Skype and WhatsApp are now commonly used to share thoughts and opinions about anything in the surrounding world. This has resulted in the proliferation of social media content, thus creating new opportunities to study public opinion at a scale that was never possible before. Naturally, this abundance of data has quickly attracted business and research interest from various fields including marketing, political science, and social studies, among many others, which are interested in questions like these: Do people like the new Apple Watch? Do Americans support ObamaCare? How do the Scottish feel about Brexit? Answering these questions requires studying the sentiment of opinions people express in social media, which has given rise to the fast growth of the field of sentiment analysis in social media, with Twitter being especially popular for research due to its scale, representativeness, variety of topics discussed, as well as ease of public access to its messages. Here we present an overview of work on sentiment analysis on Twitter. Comment: Microblog sentiment analysis; Twitter opinion mining; In the Encyclopedia on Social Network Analysis and Mining (ESNAM), Second edition. 201

    Robust Domain Adaptation Approach for Tweet Classification for Crisis Response

    Information posted by people on Twitter during crises can significantly improve crisis response and so reduce human and financial loss. Deep learning algorithms can identify relevant tweets and thereby reduce the information overload that prevents humanitarian organizations from using Twitter posts. However, they rely heavily on labeled data, which is unavailable for emerging crises. Moreover, because each crisis has its own features, such as location, time of occurrence, and social media response, current models are known to generalize poorly to unseen disaster events when pretrained on past ones. To solve this problem, we propose a domain adaptation approach that uses a distant supervision-based framework to label the unlabeled data from emerging crises. The pseudo-labeled target data, along with labeled data from similar past disasters, are then used to build the target model. Our results show that our approach can be seen as a general, robust method for classifying unseen tweets from current events.

    Niche as a determinant of word fate in online groups

    Patterns of word use both reflect and influence a myriad of human activities and interactions. Like other entities that are reproduced and evolve, words rise or decline depending upon a complex interplay between their intrinsic properties and the environments in which they function. Using Internet discussion communities as model systems, we define the concept of a word niche as the relationship between the word and the characteristic features of the environments in which it is used. We develop a method to quantify two important aspects of the size of the word niche: the range of individuals using the word and the range of topics it is used to discuss. Controlling for word frequency, we show that these aspects of the word niche are strong determinants of changes in word frequency. Previous studies have already indicated that word frequency itself is a correlate of word success at historical time scales. Our analysis of changes in word frequencies over time reveals that the relative sizes of word niches are far more important than word frequencies in the dynamics of the entire vocabulary at shorter time scales, as the language adapts to new concepts and social groupings. We also distinguish endogenous versus exogenous factors as additional contributors to the fates of words, and demonstrate the force of this distinction in the rise of novel words. Our results indicate that short-term nonstationarity in word statistics is strongly driven by individual proclivities, including inclinations to provide novel information and to project a distinctive social identity. Comment: Supporting Information is available here: http://www.plosone.org/article/fetchSingleRepresentation.action?uri=info:doi/10.1371/journal.pone.0019009.s00

    Microencapsulated foods as a functional delivery vehicle for omega-3 fatty acids: a pilot study

    It is well established that the ingestion of the omega-3 (N3) fatty acids docosahexaenoic acid (DHA) and eicosapentaenoic acid (EPA) benefits a variety of health indices. Despite these benefits, the actual intake of fish-derived N3 is relatively small in the United States. The primary aim of our study was to examine a technology capable of delivering omega-3 fatty acids in common foods via microencapsulation (MicroN3) in young, healthy, active participants who are at low risk for cardiovascular disease. Accordingly, we randomized 20 participants (25.4 ± 6.2 y; 73.4 ± 5.1 kg) to receive the double-blind delivery of a placebo-matched breakfast meal (~2093 kJ) containing MicroN3 (450–550 mg EPA/DHA) during a 2-week pilot trial. Overall, we observed no differences in dietary macronutrient intake other than the N3 delivery during our treatment regimen. Post-test ANOVA showed a significant elevation in mean (SE) plasma DHA (91.18 ± 9.3 vs. 125.58 ± 11.3 µmol/L; P < 0.05) and a reduction in triacylglycerols (89.89 ± 12.8 vs. 80.78 ± 10.4 mg/dL; P < 0.05) accompanying the MicroN3 treatment that was significantly different from placebo (P < 0.05). In post-study interviews, participants reported that the ingested food was well tolerated, with no fish taste or odor and no gastrointestinal distress accompanying treatment. MicroN3 foods provide a novel delivery system for essential fatty acids. Our study demonstrates that MicroN3 foods promote the absorption of essential N3, demonstrate bioactivity within 2 weeks of ingestion, and are well tolerated in young, active participants who are at low risk for cardiovascular disease.

    Incorporating rich background knowledge for gene named entity classification and recognition

    Background: Gene named entity classification and recognition are crucial preliminary steps of text mining in biomedical literature. Machine learning based methods have been used in this area with great success. In most state-of-the-art systems, elaborately designed lexical features, such as words, n-grams, and morphology patterns, have played a central part. However, this type of feature tends to cause extreme sparseness in feature space. As a result, out-of-vocabulary (OOV) terms in the training data are not modeled well due to lack of information. Results: We propose a general framework for gene named entity representation, called feature coupling generalization (FCG). The basic idea is to generate higher-level features using term frequency and co-occurrence information of highly indicative features in a huge amount of unlabeled data. We examine its performance in a named entity classification task, which is designed to remove non-gene entries in a large dictionary derived from online resources. The results show that new features generated by FCG outperform lexical features by 5.97 F-score overall and by 10.85 for OOV terms. Also, in this framework each extension yields significant improvements, and the sparse lexical features can be transformed into a representation that is both lower dimensional and more informative. A forward maximum match method based on the refined dictionary produces an F-score of 86.2 on the BioCreative 2 GM test set. We then combined the dictionary with a conditional random field (CRF) based gene mention tagger, achieving an F-score of 89.05, which improves the performance of the CRF-based tagger by 4.46 with little impact on the efficiency of the recognition system. A demo of the NER system is available at http://202.118.75.18:8080/bioner.
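
The forward maximum match method mentioned in the abstract above is, in its generic form, a greedy left-to-right scan that always takes the longest dictionary entry starting at the current token. The following is a minimal sketch of that generic technique, not the authors' system; the function name, the `max_len` window, and the toy dictionary are assumptions made for illustration.

```python
def forward_maximum_match(tokens, dictionary, max_len=5):
    """Greedy left-to-right longest-match lookup of dictionary entries.

    tokens: list of word tokens.
    dictionary: set of entries, each a space-joined token sequence.
    max_len: longest entry (in tokens) to try; an illustrative assumption.
    Returns a list of (start, end, entry) with end exclusive.
    """
    matches = []
    i = 0
    while i < len(tokens):
        found = False
        # Try the longest candidate first, then shrink the window.
        for length in range(min(max_len, len(tokens) - i), 0, -1):
            candidate = " ".join(tokens[i:i + length])
            if candidate in dictionary:
                matches.append((i, i + length, candidate))
                i += length  # resume scanning after the matched span
                found = True
                break
        if not found:
            i += 1  # no entry starts here; move one token forward
    return matches


# Toy example (hypothetical dictionary entries).
toy_dictionary = {"tumor necrosis factor", "tumor necrosis factor alpha", "factor"}
toy_tokens = "the tumor necrosis factor alpha gene".split()
toy_matches = forward_maximum_match(toy_tokens, toy_dictionary)
```

Because the longest candidate is tried first, the shorter overlapping entries "tumor necrosis factor" and "factor" are never emitted for this sentence.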

    Automatic term identification for bibliometric mapping

    A term map is a map that visualizes the structure of a scientific field by showing the relations between important terms in the field. The terms shown in a term map are usually selected manually with the help of domain experts. Manual term selection has the disadvantages of being subjective and labor-intensive. To overcome these disadvantages, we propose a methodology for automatic term identification and we use this methodology to select the terms to be included in a term map. To evaluate the proposed methodology, we use it to construct a term map of the field of operations research. The quality of the map is assessed by a number of operations research experts. It turns out that, in general, the proposed methodology performs quite well.

    CMB Telescopes and Optical Systems

    The cosmic microwave background radiation (CMB) is now firmly established as a fundamental and essential probe of the geometry, constituents, and birth of the Universe. The CMB is a potent observable because it can be measured with precision and accuracy. Just as importantly, theoretical models of the Universe can predict the characteristics of the CMB to high accuracy, and those predictions can be directly compared to observations. There are multiple aspects associated with making a precise measurement. In this review, we focus on optical components for the instrumentation used to measure the CMB polarization and temperature anisotropy. We begin with an overview of general considerations for CMB observations and discuss common concepts used in the community. We next consider a variety of alternatives available for a designer of a CMB telescope. Our discussion is guided by the ground and balloon-based instruments that have been implemented over the years. In the same vein, we compare the arc-minute resolution Atacama Cosmology Telescope (ACT) and the South Pole Telescope (SPT). CMB interferometers are presented briefly. We conclude with a comparison of the four CMB satellites, Relikt, COBE, WMAP, and Planck, to demonstrate a remarkable evolution in design, sensitivity, resolution, and complexity over the past thirty years. Comment: To appear in: Planets, Stars and Stellar Systems (PSSS), Volume 1: Telescopes and Instrumentation

    Evolutionary conservation of lampbrush-like loops in drosophilids

    Background: Loopin-1 is an abundant, male germ line specific protein of Drosophila melanogaster. The polyclonal antibody T53-F1 specifically recognizes Loopin-1 and enables its visualization on the Y-chromosome lampbrush-like loop named kl-3 during primary spermatocyte development, as well as on sperm tails. To test the evolutionary conservation of lampbrush-like loops, extensive phase-contrast microscopy and immunostaining with the T53-F1 antibody were performed in other drosophilids scattered along their genealogical tree. Results: In the male germ line of all species tested there are cells showing giant nuclei and intranuclear structures similar to those of Drosophila melanogaster primary spermatocytes. Moreover, the antibody T53-F1 recognizes intranuclear structures in primary spermatocytes of all drosophilids analyzed. Interestingly, the extent and conformation of the staining pattern are species-specific. In addition, the intense staining of sperm tails in all species suggests that the terminal localization of Loopin-1 and its orthologues is conserved. A comparison of these cytological data with data from the literature on sperm length, the amount of sperm tail entering the egg during fertilization, and the shape and extent of both loops and primary spermatocyte nuclei seems to exclude direct relationships among these parameters. Conclusion: Taken together, the data reported strongly suggest that lampbrush-like loops are a conserved feature of primary spermatocyte nuclei in many, if not all, drosophilids. Moreover, the conserved pattern of the T53-F1 immunostaining indicates that a Loopin-1-like protein is present in all the species analyzed, and that its localization on lampbrush-like loops and sperm tails during spermatogenesis is evolutionarily conserved.